A MATHEMATICAL ANALYSIS OF THE LEAST SQUARES 
SENSITIVITY METHOD 

QIQI WANG * 

Abstract. For a parameterized hyperbolic system $u_{i+1} = f(u_i, s)$, the derivative of an ergodic average $\langle J \rangle = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} J(u_i, s)$ with respect to the parameter $s$ can be computed via the least squares sensitivity method. This method solves a constrained least squares problem and computes an approximation to the desired derivative $d\langle J \rangle/ds$ from the solution. This paper proves that as the size of the least squares problem approaches infinity, the computed approximation converges to the true derivative.



Key words. Sensitivity analysis, linear response, least squares sensitivity, hyperbolic attractor, 
chaos, statistical average, ergodicity 

AMS subject classifications. 

1. Introduction. Consider a family of $C^1$ bijective maps $f(u, s) : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^m$ parameterized by $s \in \mathbb{R}$. We are also given a $C^1$ function $J(u, s) : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}$. We assume that the system is ergodic, i.e., the infinite time average

$$\langle J \rangle = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} J(u_i, s), \quad \text{where } u_{i+1} = f(u_i, s),\ i = 1, \ldots \tag{1.1}$$

depends on $s$ but does not depend on the initial state $u_0$. The least squares sensitivity method attempts to compute its derivative via

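Theorem LSS. Under ergodicity and hyperbolicity assumptions (details in Section 6),

$$\frac{d\langle J \rangle}{ds} = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \left( (DJ(u_i, s))\, v_i^{\{n\}} + (\partial_s J(u_i, s)) \right) \tag{1.2}$$

where $v_i^{\{n\}} \in \mathbb{R}^m$, $i = 1, \ldots, n$ is the solution to the constrained least squares problem

$$\min \frac{1}{2} \sum_{i=1}^{n} v_i^{\{n\}T} v_i^{\{n\}} \quad \text{s.t.} \quad v_{i+1}^{\{n\}} = (Df(u_i, s))\, v_i^{\{n\}} + (\partial_s f(u_i, s)), \quad i = 1, \ldots, n-1. \tag{1.3}$$

Here the linearized operators are defined as

$$\begin{aligned}
(DJ(u, s))\, v &:= (D_v J)(u, s) := \lim_{\epsilon\to 0} \frac{J(u + \epsilon v, s) - J(u, s)}{\epsilon} \\
(Df(u, s))\, v &:= (D_v f)(u, s) := \lim_{\epsilon\to 0} \frac{f(u + \epsilon v, s) - f(u, s)}{\epsilon} \\
(\partial_s J(u, s)) &:= \lim_{\epsilon\to 0} \frac{J(u, s + \epsilon) - J(u, s)}{\epsilon} \\
(\partial_s f(u, s)) &:= \lim_{\epsilon\to 0} \frac{f(u, s + \epsilon) - f(u, s)}{\epsilon}
\end{aligned} \tag{1.4}$$

$(DJ)$, $(\partial_s J)$, $(Df)$ and $(\partial_s f)$ are a $1 \times m$ matrix, a scalar, an $m \times m$ matrix and an $m \times 1$ matrix, respectively, representing the partial derivatives.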
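The definitions in (1.4) can be checked numerically by comparing difference quotients with hand-coded partial derivatives. The sketch below does this for a Hénon-type map and a simple objective; both are assumptions chosen only for illustration (they are not taken from this paper, and the Hénon map is not uniformly hyperbolic), as are all function names.

```python
import numpy as np

# Hypothetical example map and objective (NOT from the paper); chosen only
# because their partial derivatives are easy to write down by hand.
def f(u, s):
    x, y = u
    return np.array([1.0 - (1.4 + s) * x**2 + y, 0.3 * x])

def J(u, s):
    return u[0]**2 + s * u[1]

def Df(u, s):            # m x m matrix
    x, _ = u
    return np.array([[-2.0 * (1.4 + s) * x, 1.0], [0.3, 0.0]])

def ds_f(u, s):          # m x 1 matrix, stored as a length-m vector
    return np.array([-u[0]**2, 0.0])

def DJ(u, s):            # 1 x m matrix, stored as a length-m vector
    return np.array([2.0 * u[0], s])

def ds_J(u, s):          # scalar
    return u[1]

def directional_quotient(g, u, s, v, eps=1e-6):
    """Difference quotient (g(u + eps*v, s) - g(u, s)) / eps, as in (1.4)."""
    return (g(u + eps * v, s) - g(u, s)) / eps

u = np.array([0.3, -0.2])
s = 0.05
v = np.array([0.7, -1.1])
print(directional_quotient(f, u, s, v), Df(u, s) @ v)
print(directional_quotient(J, u, s, v), DJ(u, s) @ v)
```

The two printed pairs should agree up to the truncation error of the one-sided quotient, which is of order `eps`.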



*Department of Aeronautics and Astronautics, MIT, 77 Mass Ave, Cambridge, MA 02139, USA 


Computation of the derivative $d\langle J \rangle/ds$ represents a class of important problems in computational science and engineering. Many applications involve simulation of nonlinear dynamical systems that exhibit chaos. Examples include weather and climate, turbulent combustion, nuclear reactor physics, plasma dynamics in fusion, and multi-body problems in molecular dynamics. The quantities that are to be predicted (the so-called quantities of interest) are often time averages or expected values $\langle J \rangle$. Derivatives of these quantities of interest with respect to parameters are required in applications including

• Numerical optimization. The derivative of the objective function $\langle J \rangle$ with respect to the design, parameterized by $s$, is used by gradient-based algorithms to efficiently optimize in high dimensional design spaces.

• Uncertainty quantification. The derivative of the quantities $\langle J \rangle$ with respect to the sources of uncertainties $s$ can be used to assess the error and uncertainty in the computed $\langle J \rangle$.

Traditional transient sensitivity analysis methods fail to compute $d\langle J \rangle/ds$ in chaotic systems. These methods focus on linearizing initial value problems to obtain the derivative of the quantities of interest. When the quantity of interest is a long-time average in a chaotic system, the derivative of this average does not equal the long-time average of the derivative. As a result, traditional adjoint methods fail, and the root of this failure is the ill-conditioning of initial value problems of chaotic systems [6].
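The ill-conditioning can be seen in a few lines. The sketch below propagates the tangent (the derivative of the state with respect to the parameter) of the logistic map at $s = 4$. The logistic map is an assumption used only as a familiar chaotic example; it is neither bijective nor uniformly hyperbolic, so it lies outside the setting analyzed in this paper.

```python
def tangent_growth(n, s=4.0, x0=0.3):
    """Iterate x_{i+1} = s x_i (1 - x_i) together with its tangent v = dx/ds.

    Returns |v_i| for i = 1..n. The map and the initial condition are
    illustrative assumptions, not taken from the paper.
    """
    x, v = x0, 0.0
    history = []
    for _ in range(n):
        # Tangent equation: v_{i+1} = (df/dx) v_i + df/ds, evaluated at (x_i, s).
        v = s * (1.0 - 2.0 * x) * v + x * (1.0 - x)
        x = s * x * (1.0 - x)
        history.append(abs(v))
    return history

growth = tangent_growth(100)
print(growth[9], growth[49], growth[99])
```

The tangent grows roughly exponentially with the number of steps, so averaging it over a long trajectory does not produce a useful estimate of the derivative of the long-time average.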

The differentiability of $\langle J \rangle$ has been shown by Ruelle [5]. Ruelle also constructed a formula for the derivative. However, Ruelle's formula is difficult to compute numerically [HI H]. Abramov and Majda are successful in computing the derivative based on the fluctuation dissipation theorem [1]. However, for systems whose SRB measure [13] deviates strongly from Gaussian, fluctuation dissipation theorem based methods can be inaccurate. Several more recent methods have been developed for computing this derivative [10, 11, 2, 12]. In particular, the least squares sensitivity method [12] is a simple method that computes the derivative of $\langle J \rangle$ efficiently.

This paper provides a theoretical foundation for the least squares sensitivity method by proving Theorem LSS for uniformly hyperbolic maps. Section 2 lays out the basic assumptions, and introduces hyperbolicity for readers who are not familiar with this concept. Section 3 then proves a special version of the classic structural stability result, and defines the shadowing direction, a key concept used in our proof. Section 4 demonstrates that the derivative of $\langle J \rangle$ can be computed through the shadowing direction. Section 5 then shows that the least squares sensitivity method is an approximation of the shadowing direction. Section 6 finally proves Theorem LSS by showing that the approximation of the shadowing direction makes a vanishing error in the computed derivative of $\langle J \rangle$.

2. Uniform hyperbolicity. This section considers a dynamical system governed by

$$u_{i+1} = f(u_i, s) \tag{2.1}$$

with a parameter $s \in \mathbb{R}$, where $u_i \in \mathbb{R}^m$ and $f : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^m$ is $C^1$ and bijective in $u$. This paper studies perturbation of $s$ around a nominal value. Without loss of generality, we assume the nominal value of $s$ to be 0. We denote $f^{(0)}(u, s) = u$ and $f^{(i+1)}(u, s) = f^{(i)}(f(u, s), s)$ for all $i \in \mathbb{Z}$.

We assume that the map has a compact, global, uniformly hyperbolic attractor $\Lambda \subset \mathbb{R}^m$ at $s = 0$, satisfying

1. For all $u \in \mathbb{R}^m$, $\mathrm{dist}(\Lambda, f^{(n)}(u, 0)) \to 0$ as $n \to \infty$, where dist is the Euclidean distance in $\mathbb{R}^m$.

2. There is a $C \in (0, \infty)$ and $\lambda \in (0, 1)$, such that for all $u \in \Lambda$, there is a splitting of $\mathbb{R}^m$ representing the space of perturbations around $u$,

$$\mathbb{R}^m = V^+(u) \oplus V^-(u), \tag{2.2}$$

where the subspaces are

• $V^+(u) := \{v \in \mathbb{R}^m : \|(Df^{(i)}(u, 0))\, v\| \le C \lambda^{-i} \|v\|,\ \forall i < 0\}$ is the unstable subspace at $u$, where $\|\cdot\|$ is the Euclidean norm in $\mathbb{R}^m$, and

$$(Df^{(i)}(u, s))\, v := \lim_{\epsilon\to 0} \frac{f^{(i)}(u + \epsilon v, s) - f^{(i)}(u, s)}{\epsilon} = (Df^{(i-1)}(f(u, s), s))\, (Df(u, s))\, v$$

• $V^-(u) := \{v \in \mathbb{R}^m : \|(Df^{(i)}(u, 0))\, v\| \le C \lambda^{i} \|v\|,\ \forall i > 0\}$ is the stable subspace at $u$.

Both $V^+(u)$ and $V^-(u)$ are continuous with respect to $u$.

It can be shown that the subspaces $V^+(u)$ and $V^-(u)$ are invariant under the differential of the map $(Df)$, i.e., if $u' = f(u, 0)$ and $v' = (Df(u, 0))\, v$, then [9]

$$v \in V^+(u) \iff v' \in V^+(u'), \qquad v \in V^-(u) \iff v' \in V^-(u'). \tag{2.3}$$
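A linear map with a constant differential gives the simplest picture of such a splitting. The sketch below uses the matrix of the Arnold cat map as a stand-in for $(Df)$. This is an illustrative assumption: the cat map acts on a torus rather than on $\mathbb{R}^m$ with a compact global attractor, so only the linear algebra of the stable/unstable splitting is being demonstrated, with $C = 1$.

```python
import numpy as np

# Cat-map matrix, used here as a constant Df (an assumption for illustration).
A = np.array([[2.0, 1.0], [1.0, 1.0]])

# A is symmetric, so eigh returns real eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(A)
lam_stable, lam_unstable = eigvals
v_stable, v_unstable = eigvecs[:, 0], eigvecs[:, 1]

def push(v, i):
    """Apply Df^(i) = A^i to v; a negative i applies the inverse map |i| times."""
    base = A if i >= 0 else np.linalg.inv(A)
    return np.linalg.matrix_power(base, abs(i)) @ v

lam = lam_stable  # contraction rate lambda in (0, 1)
for i in range(1, 6):
    forward = np.linalg.norm(push(v_stable, i))      # decays like lam**i
    backward = np.linalg.norm(push(v_unstable, -i))  # decays like lam**i
    print(i, forward, backward, lam**i)
```

Vectors in the stable subspace shrink by a factor $\lambda$ per forward step, and vectors in the unstable subspace shrink by the same factor per backward step, which is exactly the pair of inequalities defining $V^-(u)$ and $V^+(u)$.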

Uniformly hyperbolic chaotic dynamical systems are known as "ideal chaos". Because of their relative simplicity, studies of hyperbolic chaos have generated enormous insight into the properties of chaotic dynamical systems [5]. Although most dynamical systems encountered in science and engineering are not uniformly hyperbolic, many of them are classified as quasi-hyperbolic. These systems, including the famous Lorenz system, have global properties similar to those of uniformly hyperbolic systems [3]. Results obtained on uniformly hyperbolic systems can often be generalized to quasi-hyperbolic ones. Scholars believe that very complex dynamical systems like turbulence are quasi-hyperbolic []. Although this paper focuses on proving the convergence of the least squares sensitivity method for uniformly hyperbolic systems, it has been shown numerically that this method also works when the system is not uniformly hyperbolic [12].

3. Structural stability and the shadowing direction. The hyperbolic structure (2.2) ensures the structural stability of the attractor $\Lambda$ under perturbation in $s$. Here we prove a specialized version of the structural stability result.

Theorem 1. If (2.2) holds and $f$ is continuously differentiable, then for every sequence $\{u_i^0, i \in \mathbb{Z}\} \subset \Lambda$ satisfying $u_{i+1}^0 = f(u_i^0, 0)$, there is a $\delta > 0$ such that for all $|s| < \delta$ there is a unique sequence $\{u_i^s, i \in \mathbb{Z}\} \subset \mathbb{R}^m$ satisfying $\|u_i^s - u_i^0\| < \delta$ and $u_{i+1}^s = f(u_i^s, s)$ for all $i \in \mathbb{Z}$. Furthermore, $u_i^s$ is $i$-uniformly continuously differentiable with respect to $s$.

Note: $i$-uniformly continuous differentiability of $u_i^s$ means that for all $s \in (-\delta, \delta)$ and $\epsilon > 0$, there exists $\delta' > 0$ such that $|s' - s| < \delta'$ implies

$$\left\| \left.\frac{du_i^s}{ds}\right|_{s'} - \left.\frac{du_i^s}{ds}\right|_{s} \right\| < \epsilon \quad \text{for all } i.$$

Other than the uniformly continuous differentiability of $u_i^s$, this theorem can be obtained directly from the shadowing lemma [7]. However, the uniformly continuous differentiability result requires a more in-depth proof. A more general version of this result has been proved by Ruelle [5].

To prove the theorem, we denote $\mathbf{u} = \{u_i, i \in \mathbb{Z}\}$. The norm

$$\|\mathbf{u}\|_B = \sup_i \|u_i\| \tag{3.1}$$

defines a Banach space $B$ of uniformly bounded sequences in $\mathbb{R}^m$. Define the map $F : B \times \mathbb{R} \to B$ as $F(\mathbf{u}, s) = \{u_i - f(u_{i-1}, s), i \in \mathbb{Z}\}$. We use the implicit function theorem to complete the proof, which requires $F$ to be differentiable and its derivative to be non-singular at $\mathbf{u}^0$.

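Lemma 2. Under the conditions of Theorem 1, $F$ has a Fréchet derivative at all $\mathbf{u} \in B$:

$$(DF(\mathbf{u}, s))\, \mathbf{v} = \{v_i - (Df(u_{i-1}, s))\, v_{i-1}\}, \quad \text{where } \mathbf{v} = \{v_i\}.$$

Proof. Because $\|\mathbf{u}\|_B = \sup_i \|u_i\| < \infty$, we can find $C > 2\|u_i\|$ for all $i$. Because $f \in C^1$, its derivative $(Df)$ is uniformly continuous in the compact set $\{u : \|u\| \le C\}$. For $\|\mathbf{v}\|_B < C/2$, we apply the mean value theorem to obtain

$$\frac{f(u_i + v_i, s) - f(u_i, s)}{\|\mathbf{v}\|_B} - \frac{(Df(u_i, s))\, v_i}{\|\mathbf{v}\|_B} = \frac{\big((Df(u_i + \xi v_i, s)) - (Df(u_i, s))\big)\, v_i}{\|\mathbf{v}\|_B}$$

where $0 \le \xi \le 1$. Because $\|u_i + \xi v_i\| \le \|u_i\| + \|v_i\| < C$ for all $i$, uniform continuity of $(Df)$ implies that $\forall \epsilon > 0$, $\exists \delta$ such that for all $\sup_i \|v_i\| < \delta$,

$$\left\| \frac{\big((Df(u_i + \xi v_i, s)) - (Df(u_i, s))\big)\, v_i}{\|\mathbf{v}\|_B} \right\| \le \big\| (Df(u_i + \xi v_i, s)) - (Df(u_i, s)) \big\| < \epsilon$$

for all $i$. Therefore,

$$\frac{F(\mathbf{u} + \mathbf{v}, s) - F(\mathbf{u}, s)}{\|\mathbf{v}\|_B} - \frac{\{v_i - (Df(u_{i-1}, s))\, v_{i-1}\}}{\|\mathbf{v}\|_B} = -\left\{ \frac{f(u_{i-1} + v_{i-1}, s) - f(u_{i-1}, s) - (Df(u_{i-1}, s))\, v_{i-1}}{\|\mathbf{v}\|_B} \right\} \to 0$$

in the $B$ norm as $\|\mathbf{v}\|_B \to 0$. Now we only need to show that the linear map $\{v_i\} \to \{v_i - (Df(u_{i-1}, s))\, v_{i-1}\}$ is bounded. This is because $(Df)$ is continuous, thus it is uniformly bounded in the compact set $\{u : \|u\| \le C\}$. Denote the bound in this compact set as $\|(Df)\| < A$, then $\|\{v_i - (Df(u_{i-1}, s))\, v_{i-1}\}\|_B < (1 + A)\, \|\{v_i\}\|_B$. □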

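Lemma 3. Under the conditions of Theorem 1, the Fréchet derivative of $F$ at $\mathbf{u}^0$ and $s = 0$ is a bijection.

Proof. The Fréchet derivative of $F$ at $\mathbf{u}^0$ and $s = 0$ is

$$(DF(\mathbf{u}^0, 0))\, \mathbf{v} = \{v_i - (Df(u_{i-1}^0, 0))\, v_{i-1}\}.$$

We only need to show that for every $\mathbf{r} = \{r_i\} \in B$, there exists a unique $\mathbf{v} = \{v_i\} \in B$ such that $v_i - (Df(u_{i-1}^0, 0))\, v_{i-1} = r_i$ for all $i$.

Because of (2.2), we can first split $r_i = r_i^+ + r_i^-$, where $r_i^+ \in V^+(u_i^0)$ and $r_i^- \in V^-(u_i^0)$. Because $V^+(u)$ and $V^-(u)$ are continuous with respect to $u$ and $\Lambda$ is compact,

$$\beta := \inf_{u \in \Lambda,\ r^\pm \in V^\pm(u)} \frac{\|r^+ + r^-\|}{\max(\|r^+\|, \|r^-\|)} > 0.$$

(This is because if $\beta = 0$, then by the continuity of $V^+(u)$, $V^-(u)$ and the compactness of $\{(u, r^+, r^-) \in \Lambda \times \mathbb{R}^m \times \mathbb{R}^m : \max(\|r^+\|, \|r^-\|) = 1\}$, there must be a $u \in \Lambda$, $r^+ \in V^+(u)$, $r^- \in V^-(u)$ such that $\max(\|r^+\|, \|r^-\|) = 1$ and $r^+ + r^- = 0$, which contradicts the hyperbolicity assumption (2.2).) Therefore,

$$\max(\|r_i^+\|, \|r_i^-\|) \le \frac{\|r_i\|}{\beta} \le \frac{\|\mathbf{r}\|_B}{\beta} \quad \text{for all } i.$$

Now let

$$v_i = \sum_{j=0}^{\infty} (Df^{(j)}(u_{i-j}^0, 0))\, r_{i-j}^- - \sum_{j=1}^{\infty} (Df^{(-j)}(u_{i+j}^0, 0))\, r_{i+j}^+ .$$

It can be verified that $v_i - (Df(u_{i-1}^0, 0))\, v_{i-1} = r_i$, and by the definition of $V^+(u)$ and $V^-(u)$,

$$\|v_i\| \le \sum_{j=0}^{\infty} \|(Df^{(j)}(u_{i-j}^0, 0))\, r_{i-j}^-\| + \sum_{j=1}^{\infty} \|(Df^{(-j)}(u_{i+j}^0, 0))\, r_{i+j}^+\| \le \sum_{j=0}^{\infty} C \lambda^j \frac{\|\mathbf{r}\|_B}{\beta} + \sum_{j=1}^{\infty} C \lambda^j \frac{\|\mathbf{r}\|_B}{\beta} \le \frac{2C}{1 - \lambda} \frac{\|\mathbf{r}\|_B}{\beta}. \tag{3.2}$$

Therefore, $v_i$ is uniformly bounded for all $i$. Thus $\mathbf{v} \in B$.

Because of linearity, uniqueness of $\mathbf{v}$ such that $v_i - (Df(u_{i-1}^0, 0))\, v_{i-1} = r_i$ only needs to be shown for $\mathbf{r} = 0$. To show this, we split $v_i = v_i^+ + v_i^-$ where $v_i^+ \in V^+(u_i^0)$ and $v_i^- \in V^-(u_i^0)$. Because the spaces $V^+(u_i^0)$ and $V^-(u_i^0)$ are invariant (Equation (2.3)),

$$0 = r_i = \big(v_i^+ - (Df(u_{i-1}^0, 0))\, v_{i-1}^+\big) + \big(v_i^- - (Df(u_{i-1}^0, 0))\, v_{i-1}^-\big)$$

where the two parentheses are in $V^+(u_i^0)$ and $V^-(u_i^0)$, respectively. Because $V^+(u_i^0) \cap V^-(u_i^0) = \{0\}$, both parentheses in the equation above must be 0 for all $i$, and

$$\begin{aligned}
v_i^+ &= (Df(u_{i-1}^0, 0))\, v_{i-1}^+ = \cdots = (Df^{(i-j)}(u_j^0, 0))\, v_j^+ \\
v_i^- &= (Df(u_{i-1}^0, 0))\, v_{i-1}^- = \cdots = (Df^{(i-j)}(u_j^0, 0))\, v_j^-
\end{aligned} \quad \text{for all } i > j.$$

By the definition of $V^+(u_i^0)$ and $V^-(u_i^0)$, $\|v_j^+\| \le C \lambda^{i-j} \|v_i^+\|$ and $\|v_i^-\| \le C \lambda^{i-j} \|v_j^-\|$ for all $i > j$. If $v_j^+ \ne 0$ for some $j$, then

$$\frac{\|v_i\|}{\beta} \ge \|v_i^+\| \ge \frac{\lambda^{j-i}}{C} \|v_j^+\| \quad \text{for all } i > j,$$

and $\{v_i, i \in \mathbb{Z}\}$ is unbounded. Similarly, if $v_j^- \ne 0$ for some $j$, then

$$\frac{\|v_i\|}{\beta} \ge \|v_i^-\| \ge \frac{\lambda^{i-j}}{C} \|v_j^-\| \quad \text{for all } i < j,$$

and $\{v_i, i \in \mathbb{Z}\}$ is unbounded. Therefore, for $\{v_i\}$ to be bounded, we must have $v_i = v_i^+ + v_i^- = 0$ for all $i$. This proves the uniqueness of $\mathbf{v}$ for $\mathbf{r} = 0$. □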
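The explicit series used in the existence part of this proof can be checked in a setting where everything is constant along the orbit. The sketch below assumes a constant diagonal $(Df)$ and a constant right-hand side $r_i = r$. These are hypothetical choices made only so that the answer has the closed form $v = (I - A)^{-1} r$; they are not a system from this paper.

```python
import numpy as np

# Assumed constant differential: unstable axis e1 (factor 2), stable axis e2 (factor 0.5).
A = np.diag([2.0, 0.5])
r = np.array([0.7, -1.3])          # constant right-hand side r_i = r
r_plus = np.array([r[0], 0.0])     # component of r in V+
r_minus = np.array([0.0, r[1]])    # component of r in V-

def v_from_series(n_terms=60):
    """Truncated version of the series for v_i: a forward sum over the stable
    component minus a backward sum over the unstable component."""
    A_inv = np.linalg.inv(A)
    stable_sum = sum(np.linalg.matrix_power(A, j) @ r_minus
                     for j in range(n_terms))
    unstable_sum = sum(np.linalg.matrix_power(A_inv, j) @ r_plus
                       for j in range(1, n_terms))
    return stable_sum - unstable_sum

v_series = v_from_series()
v_direct = np.linalg.solve(np.eye(2) - A, r)   # solves v - A v = r directly
print(v_series, v_direct)
```

Both sums converge geometrically because the stable component is propagated forward and the unstable component backward, which is the same mechanism that gives the bound (3.2).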

Proof. [Proof of Theorem 1] $F(\mathbf{u}^0, 0) = \{u_i^0 - f(u_{i-1}^0, 0)\} = 0$. So $\mathbf{u}^0$ is a zero point of $F$ at $s = 0$. Combining this with the two lemmas above enables application of the implicit function theorem. Thus there exists $\delta > 0$ such that for all $|s| < \delta$ there is a unique $\mathbf{u}^s = \{u_i^s\}$ satisfying $\|\mathbf{u}^s - \mathbf{u}^0\|_B < \delta$ and $F(\mathbf{u}^s, s) = 0$. Furthermore, $\mathbf{u}^s$ is continuously differentiable with respect to $s$, i.e., $\frac{d\mathbf{u}^s}{ds} \in B$ is continuous with respect to $s$ in the $B$ norm. By the definition of derivatives (in $B$ and in $\mathbb{R}^m$), $\frac{d\mathbf{u}^s}{ds} = \left\{ \frac{du_i^s}{ds} \right\}$. Continuity of $\frac{d\mathbf{u}^s}{ds}$ in $B$ then implies that $\frac{du_i^s}{ds}$ is $i$-uniformly continuous with respect to $s$. □

Theorem 1 states that for a series $\{u_i^0\}$ satisfying the governing equation (2.1) at $s = 0$, there is a series $\{u_i^s\}$ satisfying the governing equation at nearby values of $s$. In addition, $u_i^s$ shadows $u_i^0$, i.e., $u_i^s$ is close to $u_i^0$ when $s$ is close to 0. Also, $\left\{ \left.\frac{du_i^s}{ds}\right|_{s=0} \right\}$ exists and is $i$-uniformly bounded.

Definition 4. The shadowing direction $\mathbf{v}^{\{\infty\}}$ is defined as the uniformly bounded series

$$\mathbf{v}^{\{\infty\}} := \left\{ v_i^{\{\infty\}} \right\} := \left\{ \left.\frac{du_i^s}{ds}\right|_{s=0},\ i \in \mathbb{Z} \right\} = \left.\frac{d\mathbf{u}^s}{ds}\right|_{s=0} \in B,$$

where $u_i^s$ is defined by Theorem 1.

The shadowing direction is the direction in which the shadowing series $u_i^s$ moves as $s$ increases from 0. It provides a vehicle by which we prove Theorem LSS. We show that the derivative of the ergodic mean $\langle J \rangle$ with respect to $s$ can be obtained if the shadowing direction $v_i^{\{\infty\}}$ is given (Section 4). We then show that $v_i^{\{n\}}$, the solution to the constrained least squares problem (1.3), sufficiently approximates the shadowing direction $v_i^{\{\infty\}}$ when $n$ is large (Section 5). We finally show (in Section 6) that the same derivative can be obtained from the least squares solution $v_i^{\{n\}}$.
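As a sanity check of the definition, consider a scalar contracting map. This map is an assumption made purely for illustration and is not a system from this paper; its attractor is a single fixed point, so the stable subspace is everything and the unstable subspace is trivial. Its only bounded bi-infinite orbit is the fixed point, which gives the shadowing direction in closed form:

```latex
% Hypothetical example: f(u,s) = a u + s with |a| < 1.
f(u,s) = a u + s,\quad |a| < 1
\;\Longrightarrow\;
u_i^s = \frac{s}{1-a}\ \text{ for all } i,
\qquad
v_i^{\{\infty\}} = \left.\frac{d u_i^s}{ds}\right|_{s=0} = \frac{1}{1-a}.

% Consistency with the linearized dynamics
% v_{i+1} = (Df)\, v_i + (\partial_s f), where Df = a and \partial_s f = 1:
\frac{1}{1-a} = a \cdot \frac{1}{1-a} + 1 .
```

Any other solution of the linearized equation differs from this one by a multiple of $a^i$, which is unbounded as $i \to -\infty$; the shadowing direction is the unique bounded solution.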

4. Ergodic mean derivative via the shadowing direction. This section proves an easier version of Theorem LSS that replaces the solution to the constrained least squares problem $v_i^{\{n\}}$, $i = 1, \ldots, n$ by the shadowing direction $v_i^{\{\infty\}} = \left.\frac{du_i^s}{ds}\right|_{s=0}$.

Theorem 5. If (2.2) holds and $f$ is continuously differentiable, then for every continuously differentiable function $J(u, s) : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}$ whose infinite time average

$$\langle J \rangle := \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} J(f^{(i)}(u_0, s), s) \tag{4.1}$$

is independent of the initial state $u_0 \in \mathbb{R}^m$, let $\{v_i^{\{\infty\}}, i \in \mathbb{Z}\}$ be the sequence of shadowing directions in Definition 4, then

$$\left.\frac{d\langle J \rangle}{ds}\right|_{s=0} = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \left( (DJ(u_i^0, 0))\, v_i^{\{\infty\}} + (\partial_s J(u_i^0, 0)) \right). \tag{4.2}$$

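Proof. This proof is essentially an exchange of limits through uniform convergence. Because $\langle J \rangle$ in Equation (4.1) is independent of $u_0$, we set $u_0 = u_0^s$ in Theorem 1 (thus $f^{(i)}(u_0^s, s) = u_i^s$) and obtain

$$\left.\frac{d\langle J \rangle}{ds}\right|_{s=0} = \lim_{s\to 0} \frac{\langle J \rangle|_{s=s} - \langle J \rangle|_{s=0}}{s} = \lim_{s\to 0} \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \frac{J(u_i^s, s) - J(u_i^0, 0)}{s}.$$

Denote

$$\gamma_i^s := \frac{dJ(u_i^s, s)}{ds} = (DJ(u_i^s, s))\, \frac{du_i^s}{ds} + (\partial_s J(u_i^s, s))$$

and use the mean value theorem; we obtain

$$\left.\frac{d\langle J \rangle}{ds}\right|_{s=0} = \lim_{s\to 0} \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \gamma_i^{\xi_i(s)}, \quad \text{where all } |\xi_i(s)| \le |s|.$$

Because $J$ is continuously differentiable, we can choose a compact neighborhood of $\Lambda \times \{0\} \subset \mathbb{R}^m \times \mathbb{R}$ in which both $(DJ(u, s))$ and $(\partial_s J(u, s))$ are uniformly continuous. When $s$ is sufficiently small, this neighborhood of $\Lambda \times \{0\}$ contains $(u_i^s, s)$ for all $i$ because $u_i^0 \in \Lambda$ and $u_i^s$ are $i$-uniformly continuously differentiable (from Theorem 1) and therefore are $i$-uniformly continuous. Also, $\frac{du_i^s}{ds}$ are $i$-uniformly continuous. Therefore, for all $\epsilon > 0$, there exists $\delta > 0$, such that for all $|\xi| < \delta$,

$$\|\gamma_i^{\xi} - \gamma_i^0\| < \epsilon \quad \forall i.$$

Therefore, for all $|s| < \delta$, $|\xi_i(s)| \le |s| < \delta$ for all $i$, thus for all $n > 0$,

$$\left| \frac{1}{n} \sum_{i=1}^{n} \gamma_i^{\xi_i(s)} - \frac{1}{n} \sum_{i=1}^{n} \gamma_i^0 \right| \le \frac{1}{n} \sum_{i=1}^{n} \left| \gamma_i^{\xi_i(s)} - \gamma_i^0 \right| < \epsilon,$$

thus,

$$\left| \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \gamma_i^{\xi_i(s)} - \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \gamma_i^0 \right| \le \epsilon.$$

Therefore,

$$\left.\frac{d\langle J \rangle}{ds}\right|_{s=0} = \lim_{s\to 0} \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \gamma_i^{\xi_i(s)} = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \gamma_i^0.$$

This completes the proof via the definition of $\gamma_i^0$ and $v_i^{\{\infty\}}$. □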

With Theorem 5, we are one step away from the main theorem (Theorem LSS): the shadowing direction $v_i^{\{\infty\}}$ in Theorem 5 needs to be replaced by the solution $v_i^{\{n\}}$ to the least squares problem (1.3). The next section proves a bound on the distance between $v_i^{\{\infty\}}$ and $v_i^{\{n\}}$.

5. Computational approximation of shadowing direction. This section assumes all conditions of Theorem 1 and focuses on the case $s = 0$. We denote $u_i^0$ by $u_i$ in this section and the next section.

The main task of this section is providing a bound for

$$e_i^{\{n\}} := v_i^{\{n\}} - v_i^{\{\infty\}}, \quad i = 1, \ldots, n \tag{5.1}$$

where $v_i^{\{n\}}$ is the solution to the least squares problem

$$\min \frac{1}{2} \sum_{i=1}^{n} v_i^{\{n\}T} v_i^{\{n\}} \quad \text{s.t.} \quad v_{i+1}^{\{n\}} = (Df(u_i, 0))\, v_i^{\{n\}} + (\partial_s f(u_i, 0)), \quad i = 1, \ldots, n-1. \tag{5.2}$$

This bound will enable us to show that the difference between $v_i^{\{n\}}$ and $v_i^{\{\infty\}}$ makes a vanishing difference in Equation (4.2) as $n \to \infty$.

Lemma 6. $e_i^{\{n\}}$ as defined in Equation (5.1) satisfies

$$e_{i+1}^{\{n\}} = (Df(u_i, 0))\, e_i^{\{n\}}, \quad i = 1, \ldots, n-1. \tag{5.3}$$

In addition, its components in the unstable and stable directions, $e_i^{\{n\}+} \in V^+(u_i)$ and $e_i^{\{n\}-} \in V^-(u_i)$, where $e_i^{\{n\}+} + e_i^{\{n\}-} = e_i^{\{n\}}$, satisfy

$$e_{i+1}^{\{n\}+} = (Df(u_i, 0))\, e_i^{\{n\}+}, \qquad e_{i+1}^{\{n\}-} = (Df(u_i, 0))\, e_i^{\{n\}-}, \quad i = 1, \ldots, n-1. \tag{5.4}$$

Proof. By definition, $u_{i+1}^s = f(u_i^s, s)$ for all $s$ in a neighborhood of 0. By taking the derivative with respect to $s$ on both sides, we obtain

$$v_{i+1}^{\{\infty\}} = (Df(u_i, 0))\, v_i^{\{\infty\}} + (\partial_s f(u_i, 0)).$$

Subtracting this from the constraint in Equation (5.2), we obtain Equation (5.3). By substituting $e_i^{\{n\}} = e_i^{\{n\}+} + e_i^{\{n\}-}$ into Equation (5.3), we obtain

$$\big(e_{i+1}^{\{n\}+} - (Df(u_i, 0))\, e_i^{\{n\}+}\big) + \big(e_{i+1}^{\{n\}-} - (Df(u_i, 0))\, e_i^{\{n\}-}\big) = 0.$$

Because the spaces $V^+(u)$ and $V^-(u)$ are invariant (Equation (2.3)), $(Df(u_i, 0))\, e_i^{\{n\}\pm} \in V^\pm(u_{i+1})$, thus $\big(e_{i+1}^{\{n\}\pm} - (Df(u_i, 0))\, e_i^{\{n\}\pm}\big) \in V^\pm(u_{i+1})$. Because they sum to 0, both parentheses must be in $V^+(u_{i+1}) \cap V^-(u_{i+1}) = \{0\}$. This proves Equation (5.4). □

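Lemma 6 indicates that for all $\epsilon^+$ and $\epsilon^-$,

$$v_i'^{\{n\}} := v_i^{\{n\}} + \epsilon^+ e_i^{\{n\}+} + \epsilon^- e_i^{\{n\}-} \tag{5.5}$$

satisfies the constraint in Problem (5.2), i.e.,

$$v_{i+1}'^{\{n\}} = (Df(u_i, 0))\, v_i'^{\{n\}} + (\partial_s f(u_i, 0)), \quad i = 1, \ldots, n-1.$$

Because $v_i^{\{n\}}$ is the solution to Problem (5.2), it must be true that

$$\sum_{i=1}^{n} v_i^{\{n\}T} v_i^{\{n\}} \le \sum_{i=1}^{n} v_i'^{\{n\}T} v_i'^{\{n\}} \quad \text{for all } \epsilon^+ \text{ and } \epsilon^-.$$

By substituting the definition of $v_i'^{\{n\}}$ in Equation (5.5) into this inequality, and using the first order optimality condition with respect to $\epsilon^+$ and $\epsilon^-$ at $\epsilon^+ = \epsilon^- = 0$, we obtain

$$\sum_{i=1}^{n} v_i^{\{n\}T} e_i^{\{n\}+} = \sum_{i=1}^{n} v_i^{\{n\}T} e_i^{\{n\}-} = 0. \tag{5.6}$$

By substituting $v_i^{\{n\}} = v_i^{\{\infty\}} + e_i^{\{n\}} = v_i^{\{\infty\}} + e_i^{\{n\}+} + e_i^{\{n\}-}$ into Equation (5.6), we obtain

$$\begin{aligned}
\sum_{i=1}^{n} (v_i^{\{\infty\}})^T e_i^{\{n\}+} + \sum_{i=1}^{n} (e_i^{\{n\}+})^T e_i^{\{n\}+} + \sum_{i=1}^{n} (e_i^{\{n\}-})^T e_i^{\{n\}+} &= 0 \\
\sum_{i=1}^{n} (v_i^{\{\infty\}})^T e_i^{\{n\}-} + \sum_{i=1}^{n} (e_i^{\{n\}+})^T e_i^{\{n\}-} + \sum_{i=1}^{n} (e_i^{\{n\}-})^T e_i^{\{n\}-} &= 0
\end{aligned} \tag{5.7}$$

To transform Equation (5.7) into bounds on $e_i^{\{n\}+}$ and $e_i^{\{n\}-}$, we need the following lemma.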

Lemma 7. The hyperbolic splitting of $e_i^{\{n\}}$ as defined in Equation (5.1) satisfies

$$\|e_i^{\{n\}+}\| \le C \lambda^{n-i} \|e_n^{\{n\}+}\|, \qquad \|e_i^{\{n\}-}\| \le C \lambda^{i} \|e_0^{\{n\}-}\|.$$

Proof. This is a direct consequence of Equation (5.4) and the definition of $V^+$ and $V^-$ in Equation (2.2). □

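By combining the first equality in Equation (5.7) with Lemma 7 and using the Cauchy-Schwarz inequality, we obtain

$$\begin{aligned}
\|e_n^{\{n\}+}\|^2 &\le \sum_{i=1}^{n} (e_i^{\{n\}+})^T e_i^{\{n\}+} = -\sum_{i=1}^{n} (v_i^{\{\infty\}})^T e_i^{\{n\}+} - \sum_{i=1}^{n} (e_i^{\{n\}-})^T e_i^{\{n\}+} \\
&\le \sum_{i=1}^{n} \|v_i^{\{\infty\}}\|\, \|e_i^{\{n\}+}\| + \sum_{i=1}^{n} \|e_i^{\{n\}-}\|\, \|e_i^{\{n\}+}\| \\
&\le \sum_{i=1}^{n} C \lambda^{n-i} \|v_i^{\{\infty\}}\|\, \|e_n^{\{n\}+}\| + \sum_{i=1}^{n} C^2 \lambda^{n} \|e_0^{\{n\}-}\|\, \|e_n^{\{n\}+}\|.
\end{aligned}$$

Therefore,

$$\|e_n^{\{n\}+}\| \le \frac{C}{1 - \lambda} \|\mathbf{v}^{\{\infty\}}\|_B + n C^2 \lambda^{n} \|e_0^{\{n\}-}\|,$$

where the $B$ norm is as defined in Section 3 and is finite by Theorem 1. Similarly, by combining the second equality in Equation (5.7) with Lemma 7 and using the Cauchy-Schwarz inequality,

$$\|e_0^{\{n\}-}\| \le \frac{C}{1 - \lambda} \|\mathbf{v}^{\{\infty\}}\|_B + n C^2 \lambda^{n} \|e_n^{\{n\}+}\|.$$

When $n$ is sufficiently large such that $n C^2 \lambda^{n} < \frac{1}{3}$, we can substitute both inequalities into each other and obtain

$$\|e_n^{\{n\}+}\| < \frac{2C}{1 - \lambda} \|\mathbf{v}^{\{\infty\}}\|_B, \qquad \|e_0^{\{n\}-}\| < \frac{2C}{1 - \lambda} \|\mathbf{v}^{\{\infty\}}\|_B. \tag{5.8}$$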

This inequality leads to the following theorem that bounds the norm of $e_i^{\{n\}}$, the difference between the least squares solution $v_i^{\{n\}}$ and the shadowing direction.

Theorem 8. If $n$ is sufficiently large such that $3 n C^2 \lambda^{n} < 1$, then $e_i^{\{n\}}$ as defined in Equation (5.1) satisfies

$$\|e_i^{\{n\}}\| < \frac{2 C^2}{1 - \lambda} \|\mathbf{v}^{\{\infty\}}\|_B\, (\lambda^{i} + \lambda^{n-i}), \quad i = 1, \ldots, n.$$

Proof. From the hyperbolicity assumption (2.2) and Lemma 7,

$$\|e_i^{\{n\}}\| \le \|e_i^{\{n\}+}\| + \|e_i^{\{n\}-}\| \le C \lambda^{n-i} \|e_n^{\{n\}+}\| + C \lambda^{i} \|e_0^{\{n\}-}\|.$$

The theorem is then obtained by substituting Equation (5.8) into $\|e_n^{\{n\}+}\|$ and $\|e_0^{\{n\}-}\|$ in the inequality above. □
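The shape of this bound, small in the interior of the window and largest near its two ends, can be observed on a toy linear problem. The sketch below assumes a constant diagonal $(Df)$ with $\lambda = 0.5$, $C = 1$ and a constant $(\partial_s f)$. These are hypothetical choices made so that the bounded solution of the tangent equation is known in closed form; such a linear map does not have a compact attractor, so only the linear algebra of the least squares problem is illustrated.

```python
import numpy as np

lam, C = 0.5, 1.0
A = np.diag([1.0 / lam, lam])   # assumed constant Df: V+ = span(e1), V- = span(e2)
b = np.array([0.4, -0.9])       # assumed constant ds_f
n, m = 40, 2

# Stack the constraints v_{i+1} - A v_i = b, i = 1..n-1, as M V = rhs.
M = np.zeros(((n - 1) * m, n * m))
for i in range(n - 1):
    M[i * m:(i + 1) * m, i * m:(i + 1) * m] = -A
    M[i * m:(i + 1) * m, (i + 1) * m:(i + 2) * m] = np.eye(m)
rhs = np.tile(b, n - 1)

# For an underdetermined consistent system, lstsq returns the minimum-norm
# solution, i.e. the minimizer of sum_i v_i^T v_i subject to the constraints.
V = np.linalg.lstsq(M, rhs, rcond=None)[0].reshape(n, m)

v_inf = np.linalg.solve(np.eye(m) - A, b)   # bounded solution of v = A v + b
err = np.linalg.norm(V - v_inf, axis=1)     # ||e_i||, i = 1..n
idx = np.arange(1, n + 1)
bound = 2 * C**2 / (1 - lam) * np.linalg.norm(v_inf) * (lam**idx + lam**(n - idx))
print(err[0], err[n // 2], err[-1])
```

In this example the error at the middle of the window is several orders of magnitude smaller than at either end, and `err` stays below `bound` for every index.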

This theorem shows that $v_i^{\{n\}}$ is a good approximation of the shadowing direction $v_i^{\{\infty\}}$ when $n$ is large and $-\log \lambda \ll i \ll n + \log \lambda$. The next section shows that the approximation has a vanishing error in Equation (1.2) as $n \to \infty$. Combined with Theorem 5, we then prove a rigorous statement of Theorem LSS.

6. Convergence of least squares sensitivity. This section uses the results of the previous sections to prove our main theorem.

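Theorem LSS. For a $C^1$ map $f : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}^m$, assume $f(\cdot, 0)$ is bijective and defines a compact global hyperbolic attractor $\Lambda$. Let $J : \mathbb{R}^m \times \mathbb{R} \to \mathbb{R}$ be a $C^1$ map whose infinite time average $\langle J \rangle$ as defined in Equation (4.1) is independent of the initial state $u_0 \in \mathbb{R}^m$. Then for every sequence $\{u_i, i \in \mathbb{Z}\} \subset \Lambda$ satisfying $u_{i+1} = f(u_i, 0)$,

$$\left.\frac{d\langle J \rangle}{ds}\right|_{s=0} = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \left( (DJ(u_i, 0))\, v_i^{\{n\}} + (\partial_s J(u_i, 0)) \right), \tag{6.1}$$

where $v_i^{\{n\}}$, $i = 1, \ldots, n$ is the solution to the constrained least squares problem (5.2).

Proof. Because $J$ is $C^1$ and $\Lambda$ is compact, $(DJ(u_i, 0))$ is uniformly bounded, i.e., there exists $A$ such that $\|(DJ(u_i, 0))\| < A$ for all $i$. Let $e_i^{\{n\}}$ be defined as in Equation (5.1), whose norm is bounded by Theorem 8; then for large enough $n$,

$$\begin{aligned}
&\left| \frac{1}{n} \sum_{i=1}^{n} \left( (DJ(u_i, 0))\, v_i^{\{n\}} + (\partial_s J(u_i, 0)) \right) - \frac{1}{n} \sum_{i=1}^{n} \left( (DJ(u_i, 0))\, v_i^{\{\infty\}} + (\partial_s J(u_i, 0)) \right) \right| \\
&\quad = \left| \frac{1}{n} \sum_{i=1}^{n} (DJ(u_i, 0))\, e_i^{\{n\}} \right| \le \frac{1}{n} \sum_{i=1}^{n} A\, \|e_i^{\{n\}}\| \\
&\quad < \frac{1}{n} \sum_{i=1}^{n} \frac{2 A C^2}{1 - \lambda} \|\mathbf{v}^{\{\infty\}}\|_B\, (\lambda^{i} + \lambda^{n-i}) < \frac{1}{n} \frac{4 A C^2}{(1 - \lambda)^2} \|\mathbf{v}^{\{\infty\}}\|_B \xrightarrow{\ n\to\infty\ } 0,
\end{aligned}$$

where the last inequality uses $\sum_{i=1}^{n} (\lambda^{i} + \lambda^{n-i}) < \frac{2}{1 - \lambda}$. Therefore,

$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \left( (DJ(u_i, 0))\, v_i^{\{n\}} + (\partial_s J(u_i, 0)) \right) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \left( (DJ(u_i, 0))\, v_i^{\{\infty\}} + (\partial_s J(u_i, 0)) \right) = \left.\frac{d\langle J \rangle}{ds}\right|_{s=0}$$

by Theorem 5. □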

7. The least squares sensitivity algorithm. A practicable algorithm based on Theorem LSS is the following.

1. Choose large enough $n_0$ and $n$, and an arbitrary starting point $u_{-n_0} \in \mathbb{R}^m$.

2. Compute $u_{i+1} = f(u_i, s)$, $i = -n_0, \ldots, 0, 1, \ldots, n$. For large enough $n_0$, $u_1, \ldots, u_n$ are approximately on the global attractor $\Lambda$.

3. Solve the system of linear equations

$$\begin{aligned}
v_{i+1} &= (Df(u_i, s))\, v_i + (\partial_s f(u_i, s)), & i &= 1, \ldots, n-1 \\
w_{i-1} &= (Df(u_i, s))^T w_i + v_i, & i &= 1, \ldots, n \\
w_0 &= w_n = 0
\end{aligned}$$

which is the first order optimality condition of the constrained least squares problem (1.3), with $w_i$ the Lagrange multiplier of the $i$-th constraint, and gives its unique solution $v_1, \ldots, v_n$. Note that a linear relation between $w_{i-1}$, $w_i$ and $w_{i+1}$ can be obtained by substituting the second equation into the first one. A block tridiagonal solver can then be used to solve the system.

4. Compute the desired derivative by

$$\frac{d\langle J \rangle}{ds} \approx \frac{1}{n} \sum_{i=1}^{n} \left( (DJ(u_i, 0))\, v_i + (\partial_s J(u_i, 0)) \right).$$
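The four steps can be sketched in a few lines of NumPy. This is a minimal illustration and not the lssmap implementation. The function name `lss_sensitivity` and its arguments are hypothetical, a dense solve replaces the block tridiagonal solver to keep the code short, and the derivatives in step 4 are evaluated at the same parameter value `s` that generated the trajectory. The test problem is a contracting linear map, assumed here only because its answer is known in closed form; it is not chaotic.

```python
import numpy as np

def lss_sensitivity(f, Df, ds_f, DJ, ds_J, s, u_start, n0=50, n=200):
    """Sketch of steps 1-4 with dense linear algebra (suitable for small n*m)."""
    u = np.asarray(u_start, dtype=float)
    m = u.size
    for _ in range(n0):                  # steps 1-2: spin-up towards the attractor
        u = f(u, s)
    traj = [f(u, s)]                     # u_1
    for _ in range(n - 1):
        traj.append(f(traj[-1], s))      # u_2, ..., u_n

    # Step 3: write v_{i+1} - Df(u_i) v_i = ds_f(u_i), i = 1..n-1, as B V = b.
    B = np.zeros(((n - 1) * m, n * m))
    b = np.zeros((n - 1) * m)
    for i in range(n - 1):
        rows = slice(i * m, (i + 1) * m)
        B[rows, i * m:(i + 1) * m] = -Df(traj[i], s)
        B[rows, (i + 1) * m:(i + 2) * m] = np.eye(m)
        b[rows] = ds_f(traj[i], s)
    # Eliminating v from the optimality conditions gives (B B^T) w = b and
    # V = B^T w. B B^T is block tridiagonal; a dense solve is used for brevity.
    w = np.linalg.solve(B @ B.T, b)
    V = (B.T @ w).reshape(n, m)

    # Step 4: average the linearized objective along the trajectory.
    return np.mean([DJ(ui, s) @ vi + ds_J(ui, s) for ui, vi in zip(traj, V)])

# Hypothetical test problem: f(u, s) = R u + s c with ||R|| = 0.5, J(u) = u^T u.
theta, s0 = 0.8, 0.3
R = 0.5 * np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
c = np.array([1.0, 2.0])
estimate = lss_sensitivity(
    f=lambda u, s: R @ u + s * c, Df=lambda u, s: R, ds_f=lambda u, s: c,
    DJ=lambda u, s: 2.0 * u, ds_J=lambda u, s: 0.0,
    s=s0, u_start=np.zeros(2))
v_star = np.linalg.solve(np.eye(2) - R, c)   # shadowing direction of the fixed point
exact = 2.0 * s0 * (v_star @ v_star)         # closed-form d<J>/ds for this map
print(estimate, exact)
```

For this test problem the attractor is the fixed point $s (I - R)^{-1} c$, and the estimate differs from the closed-form value by a term that decays like $1/n$, in line with the rate in the proof of Theorem LSS.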

Theorem LSS shows that the computed derivative is accurate for large $n$.
This algorithm is implemented in the Python code lssmap, available at
https://github.com/qiqi/lssmap